Mark Adams
Division of Psychiatry
mark.adams@ed.ac.uk
Genetics and Environmental Influences on Behaviour and Mental Health
Common: Affects 1% or more of the population
Complex: Inheritance cannot be explained by a single gene
Why use genetics to study mental health and psychiatric disorders?
Diagram showing the seven “characters” observed by Mendel
Summing a large number of small genetic effects produces an approximately continuous, normally distributed phenotype, as expected from the Central Limit Theorem.
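A minimal simulation sketch of this idea, under assumptions that are not from the slides: 100 loci, each with two alleles of equal effect and an allele frequency of 0.5, and no environmental term. The function name `simulate_phenotypes` and all parameter values are illustrative only.

```python
import random
import statistics

def simulate_phenotypes(n_people=2000, n_loci=100, allele_freq=0.5, seed=1):
    """Sum many small additive effects per person (no environment term)."""
    rng = random.Random(seed)
    phenotypes = []
    for _ in range(n_people):
        # Each locus contributes 0, 1 or 2 copies of an allele with effect 1.
        total = sum(
            (rng.random() < allele_freq) + (rng.random() < allele_freq)
            for _ in range(n_loci)
        )
        phenotypes.append(total)
    return phenotypes

phenotypes = simulate_phenotypes()
mean = statistics.mean(phenotypes)
sd = statistics.pstdev(phenotypes)
# For a bell-shaped distribution roughly two thirds of values
# fall within one standard deviation of the mean.
within_1sd = sum(abs(p - mean) <= sd for p in phenotypes) / len(phenotypes)
print(round(mean, 1), round(sd, 1), round(within_1sd, 2))
```

Although every locus takes only three discrete values, their sum is spread over many values and is close to bell-shaped.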
Proportion of similarity in phenotypes that can be attributed to similarity in genotypes.
Model: Phenotype (P) = Genotype (G) + Environment (E)
Variance decomposition \[\mathrm{var}(P) = \mathrm{var}(G) + \mathrm{var}(E)\] Proportion of variance \[h^2 = \frac{\mathrm{var}(G)}{\mathrm{var}(P)}\]
Plot of child (offspring) height versus the average of their parents’ heights. What is a statistic that can be used to summarise the relationship between these two variables?
\(\beta = \frac{\mathrm{cov}(A, B)}{\mathrm{var}(A)}\)
Estimate the beta coefficient (slope) for a simple regression from the covariance between predictor (\(A\)) and outcome (\(B\)) variable divided by the variance of the predictor (\(A\)).
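As a small worked sketch, the slope can be computed directly from the covariance and variance. The heights below are made-up illustrative values in centimetres, not data from the lecture, and the helper names `cov`, `var` and `slope` are hypothetical.

```python
def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def var(xs):
    """Sample variance, i.e. the covariance of a variable with itself."""
    return cov(xs, xs)

def slope(a, b):
    """Regression slope of outcome b on predictor a: cov(a, b) / var(a)."""
    return cov(a, b) / var(a)

# Made-up example values (cm): predictor A and outcome B.
midparent = [160.0, 165.0, 170.0, 175.0, 180.0]
offspring = [164.0, 166.0, 171.0, 173.0, 176.0]

beta = slope(midparent, offspring)
print(round(beta, 2))  # 0.62
```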
\[ P = G + E \]
The phenotype value \(P\) is influenced by a genetic effect \(G\) and an environmental effect \(E\).
\[ G = d + s \]
Each individual has two copies of the genome, one inherited from each parent.
Phenotype (\(P\)) value is the sum of the two genetic values plus an environmental value (\(e\)).
\(\beta = \frac{\mathrm{cov}(A, B)}{\mathrm{var}(A)}\)
Therefore, \(\beta = \frac{\mathrm{cov}(\frac{P_d + P_s}{2}, P_o)}{\mathrm{var}(\frac{P_d + P_s}{2})}\)
\[ \mathrm{cov}(\frac{P_d + P_s}{2}, P_o) \]
\[ = \mathrm{cov}(\frac{d + d^\prime + e_d + s + s^\prime + e_s}{2}, d + s + e_o) \]
Expand the terms. Recall that:
\[ \mathrm{cov}(A+X,B+Y) = \mathrm{cov}(A,B) + \mathrm{cov}(A,Y) + \mathrm{cov}(X,B) + \mathrm{cov}(X,Y) \]
Thus we can do a pairwise expansion to:
\[ = \mathrm{cov}(\frac{d}{2} + \frac{d^\prime}{2} + \frac{e_d}{2} + \frac{s}{2} + \frac{s^\prime}{2} + \frac{e_s}{2}, d + s + e_o) \]
\[ = \mathrm{cov}(\frac{d}{2}, d) + \mathrm{cov}(\frac{d^\prime}{2}, d) + \dotsm + \mathrm{cov}(\frac{e_s}{2}, e_o) \]
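The expansion rule can be checked numerically. The sketch below draws arbitrary random values for \(A\), \(X\), \(B\) and \(Y\) (the distributions and sample size are assumptions chosen only for illustration) and confirms that both sides of the identity agree, and that a constant factor can be taken outside the covariance.

```python
import random

def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

rng = random.Random(0)
n = 1000
A = [rng.gauss(0, 1) for _ in range(n)]
X = [rng.gauss(0, 2) for _ in range(n)]
B = [rng.gauss(0, 1) for _ in range(n)]
Y = [rng.gauss(0, 3) for _ in range(n)]

# Left-hand side: covariance of the two sums.
lhs = cov([a + x for a, x in zip(A, X)], [b + y for b, y in zip(B, Y)])
# Right-hand side: sum of the four pairwise covariances.
rhs = cov(A, B) + cov(A, Y) + cov(X, B) + cov(X, Y)

# Scaling rule: cov(A/2, A) equals var(A)/2.
half_cov = cov([a / 2 for a in A], A)
half_var = 0.5 * cov(A, A)

print(abs(lhs - rhs) < 1e-9, abs(half_cov - half_var) < 1e-9)
```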
Some terms can be simplified.
Covariance between a genetic effect and itself \[ \mathrm{cov}(\frac{d}{2}, d), \mathrm{cov}(\frac{s}{2}, s) \]
Simplifies to:
\[ \mathrm{cov}(\frac{d}{2}, d) = \frac{1}{2}\mathrm{cov}(d, d) = \frac{1}{2}\mathrm{var}(d) \] \[ \mathrm{cov}(\frac{s}{2}, s) = \frac{1}{2}\mathrm{cov}(s, s) = \frac{1}{2}\mathrm{var}(s) \]
For some terms we might make an assumption that they are equal to 0.
Covariance between genetic effects from the same parent \[ \mathrm{cov}(\frac{d^\prime}{2}, d), \mathrm{cov}(\frac{s^\prime}{2}, s) \]
Covariance between genetic effects from different parents \[ \mathrm{cov}(\frac{d^\prime}{2}, s), \mathrm{cov}(\frac{s^\prime}{2}, d) \]
Covariance between parent and offspring environment effects \[ \mathrm{cov}(\frac{e_d}{2}, e_o), \mathrm{cov}(\frac{e_s}{2}, e_o) \]
Covariance between parental genetic and offspring environmental effects \[ \mathrm{cov}(\frac{d}{2}, e_o), \mathrm{cov}(\frac{s}{2}, e_o) \]
Using those assumptions the parent–offspring covariance simplifies to
\[ \mathrm{cov}(\frac{P_d + P_s}{2}, P_o) = \frac{\mathrm{var}(d) + \mathrm{var}(s)}{2} \]
The denominator in the regression equation was \[ \mathrm{var}(\frac{P_d + P_s}{2}) \]
Using the identity \[ \mathrm{var}(aX + bY) = a^2\mathrm{var}(X) + b^2\mathrm{var}(Y) + 2ab\mathrm{cov}(X, Y) \] the variance of the average parental phenotypes is: \[ \mathrm{var}(\frac{P_d + P_s}{2}) = \mathrm{var}(\frac{1}{2}P_d + \frac{1}{2} P_s) \] \[ = \left(\frac{1}{2}\right)^2\mathrm{var}(P_d) + \left(\frac{1}{2}\right)^2\mathrm{var}(P_s) + 2 \cdot \frac{1}{2} \cdot \frac{1}{2} \mathrm{cov}(P_d, P_s) \] \[ = \frac{1}{4}\mathrm{var}(P_d) + \frac{1}{4}\mathrm{var}(P_s) + \frac{1}{2} \mathrm{cov}(P_d, P_s) \]
If we assume, as above, that there is no covariance between the two parents' phenotypes (\(\mathrm{cov}(P_d, P_s) = 0\), that is, random mating), this simplifies to
\[ = \frac{\mathrm{var}(P_d) + \mathrm{var}(P_s)}{4} \]
Thus the regression equation is:
\[ \begin{aligned} \beta &= \frac{\mathrm{cov}(\frac{P_d + P_s}{2}, P_o)}{\mathrm{var}(\frac{P_d + P_s}{2})} \\ &= \frac{\frac{\mathrm{var}(d) + \mathrm{var}(s)}{2}}{\frac{\mathrm{var}(P_d) + \mathrm{var}(P_s)}{4}} \\ &= 2\frac{\mathrm{var}(d) + \mathrm{var}(s)}{\mathrm{var}(P_d) + \mathrm{var}(P_s)} \end{aligned} \]
Previously we defined
\[ G = d + s \] thus, with \(\mathrm{cov}(d, s) = 0\) as assumed above, \[ \mathrm{var}(G) = \mathrm{var}(d) + \mathrm{var}(s) \] and assume variances in parental phenotypes are equal \[ \mathrm{var}(P_d) = \mathrm{var}(P_s) = \mathrm{var}(P) \]
Then substitute into the regression equation
\[ \begin{aligned} \beta &= 2\frac{\mathrm{var}(d) + \mathrm{var}(s)}{\mathrm{var}(P_d) + \mathrm{var}(P_s)} \\ &= 2 \frac{\mathrm{var}(G)}{\mathrm{var}(P) + \mathrm{var}(P)} \\ &= 2 \frac{\mathrm{var}(G)}{2 \mathrm{var}(P)} \\ &= \frac{\mathrm{var}(G)}{\mathrm{var}(P)} \\ &= h^2 \end{aligned} \]
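The derivation can be checked with a small simulation. This is a sketch under the same assumptions as the derivation (independent genetic and environmental effects, random mating, equal variances in both parents), plus further assumptions of my own: all effects are normally distributed, total phenotypic variance is 1, and the true \(h^2\) is set to 0.6. The function name `simulate_trios` is hypothetical.

```python
import random

def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def simulate_trios(n=20000, h2=0.6, seed=42):
    """Simulate mid-parent and offspring phenotypes with var(P) = 1."""
    rng = random.Random(seed)
    sd_half = (h2 / 2) ** 0.5   # each transmitted/untransmitted effect has variance h2/2
    sd_env = (1 - h2) ** 0.5    # environmental effect has variance 1 - h2
    midparent, offspring = [], []
    for _ in range(n):
        # d and s are transmitted to the offspring; d' and s' are not.
        d, d_prime = rng.gauss(0, sd_half), rng.gauss(0, sd_half)
        s, s_prime = rng.gauss(0, sd_half), rng.gauss(0, sd_half)
        p_d = d + d_prime + rng.gauss(0, sd_env)
        p_s = s + s_prime + rng.gauss(0, sd_env)
        p_o = d + s + rng.gauss(0, sd_env)
        midparent.append((p_d + p_s) / 2)
        offspring.append(p_o)
    return midparent, offspring

midparent, offspring = simulate_trios()
beta = cov(midparent, offspring) / cov(midparent, midparent)
print(round(beta, 2))  # should be close to the simulated h2 of 0.6
```

Changing the `h2` argument and re-running shows the slope tracking the simulated heritability, up to sampling error.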
Parent and offspring phenotypes become more highly correlated as heritability increases.
Mini review: What assumptions have we made when estimating \(h^2\)?
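Heritability can also be estimated from resemblance between different types of related pairs. The general equation is:
\[ h^2 = \frac{b}{r} \]
where \(b\) is the regression (or correlation) coefficient between the relatives' phenotypes and \(r\) is their coefficient of relatedness.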
Correlation of depression scores for different pairs of relatives
\[ \lambda_\mathrm{R} = \frac{P(\mathrm{affected} \mid \mathrm{relative\ affected})}{P(\mathrm{affected\ in\ population})} = \frac{K_\mathrm{R}}{K} \]
Example:
\[ \begin{aligned} \mathrm{cov}(Y, Y_\mathrm{R}) &= E[YY_\mathrm{R}] - E[Y] E[Y_\mathrm{R}] \\ &= K \times K_\mathrm{R} - K^2 \\ &= K(K_\mathrm{R} - K) \\ &= K^2 \left(\frac{K_\mathrm{R}}{K} - 1\right) \\ &= K^2 (\lambda_\mathrm{R} - 1) \end{aligned} \]
\[ \begin{aligned} h^2 &= \frac{\mathrm{cov}_\mathrm{R}}{rV_\mathrm{P}} \\ &= \frac{K^2 (\lambda_\mathrm{R} - 1)}{rK(1-K)} \\ &= \frac{K (\lambda_\mathrm{R} - 1)}{r(1-K)} \end{aligned} \]
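The final expression is simple enough to wrap in a small helper. The sketch below is illustrative: the function name `h2_from_relative_risk` and the input values (a prevalence of 1%, a relative risk of 5, first-degree relatives with \(r = 0.5\)) are made up for the example and are not estimates for any real disorder. Because \(V_\mathrm{P} = K(1-K)\) is the variance of a 0/1 affected status, the result is on the observed binary scale.

```python
def h2_from_relative_risk(K, lambda_r, r):
    """Heritability on the observed (0/1) scale from the population
    prevalence K, the relative risk lambda_r to relatives, and the
    coefficient of relatedness r: K * (lambda_r - 1) / (r * (1 - K)).
    """
    if not 0 < K < 1:
        raise ValueError("K must be a proportion strictly between 0 and 1")
    if r <= 0:
        raise ValueError("r must be positive")
    return K * (lambda_r - 1) / (r * (1 - K))

# Made-up illustrative inputs, not real estimates.
h2_obs = h2_from_relative_risk(K=0.01, lambda_r=5.0, r=0.5)
print(round(h2_obs, 3))

# With no excess risk in relatives (lambda_r = 1) the estimate is zero.
h2_null = h2_from_relative_risk(K=0.01, lambda_r=1.0, r=0.5)
```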
Contrast pairs of relatives that have comparable environmental similarity but different genetic similarity.
MZ twins: \(t(\mathrm{MZ}) = h^2 + c^2\)
DZ twins: \(t(\mathrm{DZ}) = \frac{1}{2}h^2 + c^2\)
\[t(\mathrm{MZ}) = h^2 + [t(\mathrm{DZ}) - \frac{1}{2}h^2]\] \[\frac{1}{2}h^2 = t(\mathrm{MZ}) - t(\mathrm{DZ})\] \[h^2 = 2[t(\mathrm{MZ}) - t(\mathrm{DZ})]\]
\[t(\mathrm{MZ}) = 2[t(\mathrm{MZ}) - t(\mathrm{DZ})] + c^2\] \[t(\mathrm{MZ}) - 2[t(\mathrm{MZ}) - t(\mathrm{DZ})] = c^2\] \[c^2 = t(\mathrm{MZ}) - 2t(\mathrm{MZ}) + 2t(\mathrm{DZ})\] \[c^2 = 2t(\mathrm{DZ}) - t(\mathrm{MZ})\]
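The two results, \(h^2 = 2[t(\mathrm{MZ}) - t(\mathrm{DZ})]\) and \(c^2 = 2t(\mathrm{DZ}) - t(\mathrm{MZ})\), can be applied directly to a pair of twin correlations. The sketch below uses made-up correlations of 0.6 and 0.4 purely for illustration; the function name `twin_estimates` is hypothetical.

```python
def twin_estimates(t_mz, t_dz):
    """Estimate h^2 and c^2 from MZ and DZ twin correlations,
    assuming t(MZ) = h^2 + c^2 and t(DZ) = h^2 / 2 + c^2.
    """
    h2 = 2 * (t_mz - t_dz)
    c2 = 2 * t_dz - t_mz
    return h2, c2

# Made-up illustrative correlations, not real twin data.
h2, c2 = twin_estimates(t_mz=0.6, t_dz=0.4)
print(round(h2, 2), round(c2, 2))  # 0.4 0.2
```

As a consistency check, the two estimates always sum back to the MZ correlation, since \(h^2 + c^2 = t(\mathrm{MZ})\).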